{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Notebook 8: LangChain Integrations\n",
    "\n",
    "[LangChain](https://python.langchain.com/docs/get_started/introduction.html) is a popular library for developing applications powered by language models. You can use LangChain with LLMs to build various interesting applications such as [Chatbot](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/langchain/transformers_int4/chat.py), [Document Q&A](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/langchain/transformers_int4/docqa.py), [voice assistant](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/langchain/transformers_int4/voiceassistant.py). BigDL-LLM provides LangChain integrations (i.e. LLM wrappers and embeddings) and you can use them the same way as [other LLM wrappers in LangChain](https://python.langchain.com/docs/integrations/llms/). \n",
    "\n",
    "This notebook goes over how to use langchain to interact with BigDL-LLM.\n",
    "\n",
    "## 8.1 Installation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First of all, install BigDL-LLM in your prepared environment. For best practices of environment setup, refer to [Chapter 2](../ch_2_Environment_Setup/README.md) in this tutorial."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install bigdl-llm[all]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then install LangChain."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install -U langchain==0.0.248"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **Note**\n",
    "> \n",
    "> We recommend to use `langchain==0.0.248`, which is verified in our tutorial."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 8.2 LLM Wrapper\n",
    "\n",
    "BigDL-LLM provides `TransformersLLM` and `TransformersPipelineLLM`, which implement the standard interface of LLM wrapper of LangChain.\n",
    "\n",
    "`TransformerLLM` can be instantiated using `TransformerLLM.from_model_id` from a huggingface model_id or path. Model generation related parameters (e.g. `temperature`, `max_length`) can be passed in as a dictionary in `model_kwargs`. Let's use [`vicuna-7b-v1.5`](https://huggingface.co/lmsys/vicuna-7b-v1.5) model as an example to instatiate `TransformerLLM`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from bigdl.llm.langchain.llms import TransformersLLM\n",
    "\n",
    "llm = TransformersLLM.from_model_id(\n",
    "        model_id=\"lmsys/vicuna-7b-v1.5\",\n",
    "        model_kwargs={\"temperature\": 0, \"max_length\": 1024, \"trust_remote_code\": True},\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **Note**\n",
    ">\n",
    "> `TransformersPipelineLLM` can be instantiated in similar way as `TransformersLLM` from a huggingface model_id or path, `model_kwargs` and `pipeline_kwargs`. Besides, there's an extra `task` parameter which specifies the type of task to perform."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Use a prompt template to format the prompt and simply call `llm` to test generation.\n",
    "\n",
    "> **Note**\n",
    ">\n",
    "> `max_new_tokens` parameter defines the maximum number of tokens to generate."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 197,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "AI stands for \"Artificial Intelligence.\" It refers to the development of computer systems that can perform tasks that typically require human intelligence, such as visual perception, speech recognition, decision-making, and language translation. AI can be achieved through a combination of techniques such as machine learning, natural language processing, computer vision, and robotics. The ultimate goal of AI research is to create machines that can think and learn like humans, and can even exceed human capabilities in certain areas.\n"
     ]
    }
   ],
   "source": [
    "prompt = \"What is AI?\"\n",
    "VICUNA_PROMPT_TEMPLATE = \"USER: {prompt}\\nASSISTANT:\"\n",
    "result = llm(prompt=VICUNA_PROMPT_TEMPLATE.format(prompt=prompt), max_new_tokens=128)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also use `generate` on LLM to get batch results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "llm_result = llm.generate([VICUNA_PROMPT_TEMPLATE.format(prompt=\"Tell me a joke\"), VICUNA_PROMPT_TEMPLATE.format(prompt=\"Tell me a poem\")]*3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 199,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--------------------number of generations--------------------\n",
      "6\n",
      "--------------------the first generation--------------------\n",
      "USER: Tell me a joke\n",
      "ASSISTANT: Why did the tomato turn red?\n",
      "\n",
      "Because it saw the salad dressing!\n"
     ]
    }
   ],
   "source": [
    "print(\"-\"*20+\"number of generations\"+\"-\"*20)\n",
    "print(len(llm_result.generations))\n",
    "print(\"-\"*20+\"the first generation\"+\"-\"*20)\n",
    "print(llm_result.generations[0][0].text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 8.3 Using Chains"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's begin using LLM wrappers and embeddings in [Chains](https://docs.langchain.com/docs/components/chains/).\n",
    "\n",
    ">**Note**\n",
    ">\n",
    "> Chain is an important component in LangChain, which combines a sequence of modular components (even other chains) to achieve a particular purpose. The compoents in chain may be propmt templates, models, memory buffers, etc. \n",
    "\n",
    "### 8.3.1 LLMChain\n",
    "\n",
    "Let's first try use a simple chain `LLMChain`. \n",
    "\n",
    "Create a simple prompt template as below. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 252,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain import PromptTemplate\n",
    "\n",
    "template =\"USER: {question}\\nASSISTANT:\"\n",
    "prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now use the `llm` we created in previous section and the prompt tempate we just created to instantiate a `LLMChain`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 201,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain import LLMChain\n",
    "\n",
    "llm_chain = LLMChain(prompt=prompt, llm=llm)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's ask the llm a question and get the response by calling `run` on `LLMChain`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 203,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "AI stands for \"Artificial Intelligence.\" It refers to the development of computer systems that can perform tasks that typically require human intelligence, such as visual perception, speech recognition, decision-making, and language translation. AI can be achieved through a combination of techniques such as machine learning, natural language processing, computer vision, and robotics. The ultimate goal of AI research is to create machines that can think and learn like humans, and can even exceed human capabilities in certain areas.\n"
     ]
    }
   ],
   "source": [
    "question = \"What is AI?\"\n",
    "result = llm_chain.run(question)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 8.3.2 Conversation Chain"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To build a chat application, we can use a more complex chain with memory buffers to remember the chat history. This is useful to enable multi-turn chat experience.\n",
    "\n",
    "> **Note**\n",
    ">\n",
    "> `ConversationBufferMemory` is a type of memory in LangChain that allows for storing messages from a conversation and extracting them in different formats."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain import PromptTemplate\n",
    "from langchain.chains import ConversationChain\n",
    "from langchain.chains.conversation.memory import ConversationBufferMemory\n",
    "\n",
    "template = \"The following is a friendly conversation between a human and an AI.\\\n",
    "    \\nCurrent conversation:\\n{history}\\nHuman: {input}\\nAI Asistant:\"\n",
    "prompt = PromptTemplate(template=template, input_variables=[\"history\", \"input\"])\n",
    "conversation_chain = ConversationChain(\n",
    "    verbose=True,\n",
    "    prompt=prompt,\n",
    "    llm=llm,\n",
    "    memory=ConversationBufferMemory(),\n",
    "    llm_kwargs={\"max_new_tokens\": 256},\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new ConversationChain chain...\u001b[0m\n",
      "Prompt after formatting:\n",
      "\u001b[32;1m\u001b[1;3mThe following is a friendly conversation between a human and an AI.    \n",
      "Current conversation:\n",
      "\n",
      "Human: Good morning AI!\n",
      "AI Asistant:\u001b[0m\n",
      "Good morning! How can I assist you today?\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "query =\"Good morning AI!\" \n",
    "result = conversation_chain.run(query)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new ConversationChain chain...\u001b[0m\n",
      "Prompt after formatting:\n",
      "\u001b[32;1m\u001b[1;3mThe following is a friendly conversation between a human and an AI.    \n",
      "Current conversation:\n",
      "Human: Good morning AI!\n",
      "AI: The following is a friendly conversation between a human and an AI.    \n",
      "Current conversation:\n",
      "\n",
      "Human: Good morning AI!\n",
      "AI Asistant: Good morning! How can I assist you today?\n",
      "Human: Tell me about Intel.\n",
      "AI Asistant:\u001b[0m\n",
      "Intel is a multinational technology company that specializes in the development and manufacturing of computer hardware and technology solutions. It was founded in 1976 by Robert Noyce and Gordon Moore, and is headquartered in Santa Clara, California. Intel is best known for its microprocessors, which are the \"brains\" of most computers and devices. They also produce other hardware such as motherboard chipsets, flash memory, and other components. Intel is one of the largest and most well-known technology companies in the world, and is a leader in the development of new technologies and innovations in the field of computing.\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "query =\"Tell me about Intel.\" \n",
    "result = conversation_chain.run(query)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 8.3.3 MathChain"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's try use LLM solve some math problem, using `MathChain`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **Note** \n",
    "> MathChain usually need LLMs to be instantiated with larger `max_length`, e.g. 1024\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 255,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.chains import LLMMathChain\n",
    "\n",
    "MATH_CHAIN_TEMPLATE =\"Question: {question}\\nAnswer:\"\n",
    "prompt = PromptTemplate(template=MATH_CHAIN_TEMPLATE, input_variables=[\"question\"])\n",
    "llm_math = LLMMathChain.from_llm(prompt=prompt, llm=llm, verbose=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 256,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new LLMMathChain chain...\u001b[0m\n",
      "What is 13 raised to the 2 power"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "13 raised to the 2 power is equal to 13 \\* 13, which is 169.\n",
      "\u001b[32;1m\u001b[1;3mQuestion: What is 13 raised to the 2 power\n",
      "Answer: 13 raised to the 2 power is equal to 13 \\* 13, which is 169.\u001b[0m\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "'Answer:  13 raised to the 2 power is equal to 13 \\\\* 13, which is 169.'"
      ]
     },
     "execution_count": 256,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "question = \"What is 13 raised to the 2 power\"\n",
    "llm_math.run(question)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 8.4 Question Answering over Docs\n",
    "Suppose you have some text documents (PDF, blog, Notion pages, etc.) and want to ask questions related to the contents of those documents. LLMs, given their proficiency in understanding text, are a great tool for this."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.4.1 Installation\n",
    "\n",
    "Please install the necessary dependency library before running the example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install -U faiss-cpu"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.4.2 Load documents\n",
    "\n",
    "For convienence, here we use a text string as a loaded document. Unstructured data can be loaded from many sources. Use the [LangChain integration hub](https://integrations.langchain.com/) to browse the full set of loaders."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {},
   "outputs": [],
   "source": [
    "input_doc = \"\\\n",
    "    BigDL: fast, distributed, secure AI for Big Data\\n\\n\\\n",
    "    BigDL seamlessly scales your data analytics & AI applications from laptop to cloud, with the following libraries:\\\n",
    "        Orca: Distributed Big Data & AI (TF & PyTorch) Pipeline on Spark and Ray\\\n",
    "        Nano: Transparent Acceleration of Tensorflow & PyTorch Programs on XPU\\\n",
    "        DLlib: “Equivalent of Spark MLlib” for Deep Learning\\\n",
    "        Chronos: Scalable Time Series Analysis using AutoML\\\n",
    "        Friesian: End-to-End Recommendation Systems\\\n",
    "        PPML: Secure Big Data and AI (with SGX Hardware Security)\\\n",
    "        LLM: A library for running large language models with very low latency using low-bit techniques on Intel platforms\\n\\n\\\n",
    "    \""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.4.3 Split texts of input documents\n",
    "\n",
    "[Text splitters](https://python.langchain.com/docs/modules/data_connection/document_transformers/) break Documents into splits of specified size. Here, we split the Document into chunks for embedding and vector storage.\n",
    "\n",
    "> **Note**\n",
    "> \n",
    "> `CharacterTextSplitter` will only split on separator (which is `'\\n\\n'` by default).\n",
    ">\n",
    "> `chunk_size` is the maximum number of characters that will be split if splitting is possible.\n",
    ">\n",
    "> `chunk_overlap` is the overlap number of characters between each split."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.text_splitter import CharacterTextSplitter\n",
    "\n",
    "text_splitter = CharacterTextSplitter(chunk_size=650, chunk_overlap=0)\n",
    "texts = text_splitter.split_text(input_doc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.4.4 Create embeddings and store into vector stores\n",
    "\n",
    "After splitting the documents, we need to store the splits for later we can search them based on input query. The most common way to do this is to embed the contents of each split then store the embedding vectors in a vector store. \n",
    "\n",
    "As we known, in Transformers, there are some embedding layers to transform unstructured data to embedding vectors(a list of numbers) to perform various operations with them. The embedding vectors represent the real-world objects and concepts, such as words, documents and so on."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "BigDL-LLM provides `TransformersEmbeddings`, which allows you to obtain embeddings from text input using LLM.\n",
    "\n",
    "`TransformersEmbeddings` can be instantiated the similar way as `TransformersLLM`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from bigdl.llm.langchain.embeddings import TransformersEmbeddings\n",
    "\n",
    "embeddings = TransformersEmbeddings.from_model_id(model_id=\"lmsys/vicuna-7b-v1.5\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After introducing `TransformersEmbeddings`, let's create embeddings and store into vector stores. A vector store takes care of storing embedded data and performing vector search for you. Here we use [Faiss](https://faiss.ai/index.html) as an example, Faiss is a library for efficient similarity search and clustering of dense vectors."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.vectorstores import FAISS\n",
    "\n",
    "docsearch = FAISS.from_texts(texts, embeddings, metadatas=[{\"source\": str(i)} for i in range(len(texts))]).as_retriever()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.4.5 Get relavant documents\n",
    "\n",
    "As we mentioned, embedding vectors can be the representation of queries and documents. This representation makes it possible to translate semantic similarity to proximity in a vector space. Thus we can search the document through this similarity."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--------------------number of relevant documents--------------------\n",
      "2\n"
     ]
    }
   ],
   "source": [
    "query = \"What is BigDL?\"\n",
    "docs = docsearch.get_relevant_documents(query)\n",
    "print(\"-\"*20+\"number of relevant documents\"+\"-\"*20)\n",
    "print(len(docs))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.4.6 Prepare chain"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.chains.chat_vector_db.prompts import QA_PROMPT\n",
    "from langchain.chains.question_answering import load_qa_chain\n",
    "\n",
    "doc_chain = load_qa_chain(\n",
    "    llm, chain_type=\"stuff\", prompt=QA_PROMPT\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.4.7 Generate"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "BigDL is a fast, distributed, and secure AI library for Big Data. It enables seamless scaling of data analytics and AI applications from laptops to the cloud. BigDL supports various libraries, including Orca, Nano, DLlib, Chronos, Friesian, PPML, and LLM. These libraries cater to different use cases, such as distributed Big Data processing, transparent acceleration of TensorFlow and PyTorch programs, scalable time series analysis, end-to-end recommendation systems, and secure Big Data and AI with SGX hardware security. BigDL aims to provide a unified platform for AI and data analytics, making it easier for developers to build and deploy their applications at scale.\n"
     ]
    }
   ],
   "source": [
    "result = doc_chain.run(input_documents=docs, question=query)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llm-langchain",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.17"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
